Delphi method for medical coding

ABSTRACT

Systems and methods provide confirmation of accuracy of determining medical codes for medical records. The technology may include processor control instructions or steps for establishing a collection of medical documents as a test sample and for determining a convergence of assigned medical codes for the sample that have been assigned by a plurality of coders to establish a standard of one or more accepted medical codes for the sample. The technology may include instructions for applying the sample to a coding system to obtain a determined medical code for the sample; and for comparing the determined code to the accepted medical codes to rate the coding system's accuracy. In some embodiments, the determined medical codes may be obtained by a software coding algorithm that automatically assigns medical codes to medical records. Moreover, the comparing step may be performed by an algorithm that automatically calculates a rate of the coding system.

CROSS-REFERENCES TO RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 14/169,507, filed Jan. 31, 2014, which is a divisional of U.S. patent application Ser. No. 12/519,899, filed Oct. 20, 2009, now U.S. Pat. No. 8,676,605, which application is a national phase entry under 35 U.S.C. §371 of International Application No. PCT/US2007/026094 filed Dec. 20, 2007, which claims the benefit of the filing date of U.S. Provisional Patent Application No. 60/919,076 filed Mar. 20, 2007 and U.S. Provisional Patent Application No. 60/876,056 filed Dec. 20, 2006, the disclosures of which are hereby incorporated herein by reference.

BACKGROUND

Applying medical codes to documentation generated in a hospital or physician's office requires trained practitioners to apply a complicated set of rules to determine which of the thousands of ICD-9 and CPT codes apply to any given patient during a particular encounter. Although medical coders are trained and certified to perform this function, there is a great deal of variance between coders in how they perform this task (Department of Veteran Affairs, 1993; Morris, Heinze, Warner, et al., 2000; Morsch, Heinze, & Byrd, 2004).

Coder variability has an obvious impact on the quality of hospital coding. Beyond this, coder variability makes it difficult to establish a standard by which coders and computer assisted coding systems (CACs) can be evaluated. The need for such a “gold standard” is now acknowledged in the coding industry (Morris, Heinze, Warner, et al., 2000; Resnik, Nossal, Schnitzer et al., 2006). Because such a standard will necessarily be based on human judgment, a process needs to be created by which the variant products of human coders can be transformed into a consensus view for any given set of medical documentation. That is, there needs to be a process by which coders who may initially disagree on which codes should be applied to a given set of documents can come to an agreed upon consensus on how these documents should be coded. Artificial Medical Intelligence (AMI) has developed EMscribe GS, a consensus building process for this purpose, by adapting some of the techniques used in the Delphi Method to the medical coding problem.

One of the broader definitions of the Delphi Method is as follows:

-   -   “Delphi may be characterized as a method for structuring a group communication process so that the process is effective in allowing a group of individuals, as a whole, to deal with a complex problem.” (Linstone & Turoff, 2002).

The defining characteristics of the Delphi method include:

-   -   Receiving input from a variety of experts about a topic of interest, typically anonymously.
    -   Obtaining this input in a structured way (e.g. a questionnaire, an opinion on a defined problem, a set of rating scales).
    -   Evaluating the input using a set of criteria, and filtering and summarizing it if necessary.
    -   Presenting this evaluation to the experts again and giving them an opportunity to comment on it and change their input based on the evaluation.
    -   Evaluating this second round of input and presenting this second evaluation back to the experts.
    -   Iteratively repeating the process until the opinions of the experts are stable and, in some instances, have converged on a consensus opinion (Linstone & Turoff, 2002).

Since its development in the 1950s at the RAND Corporation, the Delphi method has been used for a wide range of applications including:

1. Development of policy related to resource management and drug abuse

2. Project estimation

3. Risk analysis

4. Technology projections and

5. Trend analysis (Linstone & Turoff, 2002).

To date it has not been used for the more structured task of medical coding.

SUMMARY

An example of the present technology includes methods for confirming accuracy of coders. The method may include establishing a collection of medical documents as a test sample; determining a convergence of assigned medical codes for the test sample that have been assigned by a plurality of coders to establish a standard of one or more accepted medical codes for the test sample; applying the test sample to a coding system to obtain one or more determined medical codes for the test sample; and comparing the determined codes to the accepted medical codes to rate the accuracy of the coding system. In some embodiments, the determined medical codes may be obtained by a software coding algorithm that automatically assigns medical codes to medical records. Moreover, the comparing step may be performed by a software algorithm that automatically calculates a rate of the coding system.

In some embodiments, the plurality of coders may include at least one automated coding system. Moreover, the determined medical codes may be obtained from human coders through a computer system that accepts data medical codes assigned with a user interface of the computer system. In addition, the plurality of coders may optionally include at least one or more automated coding system(s) that accepts data medical codes assigned with a user interface of the coding system. In another embodiment of the technology, the plurality of coders includes at least one automated coding system that includes a software algorithm that automatically assigns medical codes to the collection of medical records.

In other embodiments of the technology, a method for coding medical documents may include presenting a set of documents to each of a plurality of game contestants in a user interface. The method may further include receiving input from at least one of the user interfaces concerning a potential classification code for the set of documents. A score may then be determined for at least one of the plurality of game contestants based on the received input from the user interface. In one embodiment, the method may further involve determining an accepted set of classification codes for the set of documents from input of the user interfaces of the plurality of game contestants. Moreover, the method may also include determining the score based on the determination of the accepted classification codes.

At least one embodiment of the technology involves an automated system for confirming accuracy of coders. The system may include processor control instructions to establish a collection of electronic medical documents as a test sample and processor control instructions to determine in a computer processor a convergence of assigned medical codes for the test sample that have been assigned by a plurality of coders to establish a standard of one or more accepted medical codes for the test sample. The system may further include processor control instructions to apply the test sample to a coding system to obtain one or more determined medical codes for the test sample and processor control instructions to compare in a computer processor the determined codes to the accepted medical codes to rate the accuracy of the coding system. In an embodiment, the determined medical codes may be obtained by a software coding algorithm that automatically assigns medical codes to medical records. In another embodiment, the comparing is performed by a software algorithm that automatically calculates a rate of the coding system.

In another embodiment, the plurality of coders of the system may include at least one automated coding system. Moreover, the determined medical codes may be obtained by a computer system that accepts data medical codes assigned with a user interface of the computer system. Furthermore, the plurality of coders may include at least one or all automated coding systems that accept data medical codes assigned with a user interface of the coding system. In a still further embodiment, the plurality of coders may include at least one automated coding system that includes a software algorithm that automatically assigns medical codes to the collection of medical records.

In another embodiment of the technology, an automated system for coding medical documents may include user interfaces to present a set of electronic medical documents to each of a plurality of game contestants. The system may further include processor control instructions programmed to receive input from at least one of the user interfaces concerning a potential classification code for the set of electronic medical documents, and processor control instructions to determine a score for at least one of the plurality of game contestants based on the received input from the user interface. In an embodiment, the automated system may also include processor control instructions to determine an accepted classification code for the set of electronic medical documents from input of the user interfaces of the plurality of game contestants. Moreover, the automated system may further include processor control instructions to determine the score based on the determination of the accepted classification code.

Additional aspects of the technology will be apparent from a review of the following description and drawings.

BRIEF DESCRIPTION OF DRAWINGS

The present technology is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to similar elements including:

FIG. 1 is a block diagram of an example embodiment of a method of assigning medical codes of the present technology;

FIG. 2 shows an embodiment of a user interface display when assigning codes during a gaming embodiment of the present technology;

FIG. 3 illustrates an embodiment of a generic user interface display when voting for codes during a gaming embodiment of the present technology;

FIG. 4 shows an embodiment of a user interface display when assigning codes during a round one of a gaming embodiment of the present technology; and

FIG. 5 shows an embodiment of a user interface display when voting on codes during a round two of a gaming embodiment of the present technology.

DETAILED DESCRIPTION

We have observed that the task of medical coding exhibits characteristics that we believe make it a candidate for the Delphi approach, including the following attributes:

-   -   1. Medical coding is a complicated task.
    -   2. There is a lack of a generally agreed upon “solution” (set of codes) that could be applied to any set of documents.

Applying the Delphi Method to the general day to day task of coding medical documents would be too expensive to implement. However, we have applied it for the more occasional task of creating a “gold standard” by which an organization could evaluate coders or computer assisted coding solutions. AMI's Delphi Method Of Medical Coding (DMMC) incorporated in EMscribe GS is designed to do just these types of tasks.

The Delphi Method Of Medical Coding (DMMC) utilized in EMscribe GS is based on the premise that collective expertise of a group of experienced coders will result in more accurately coded documentation than (for the most part) the application of that expertise by any individual coder. Coders will correct errors made by others in the group, they will think of things that may escape the attention of any individual coder, they will check each other's work, etc. As a result, the consensus view will be more accurate and more complete than the product of any individual coder's efforts.

One approach using the DMMC process employed by EMscribe GS is illustrated by the following methodology, which may be implemented through the use of automated or computerized systems in the performance of the methodology (in whole or in part) depending on the nature of the particular characteristics as discussed below:

-   -   1. A test set of medical documentation is assembled.
    -   2. Coders are recruited.
    -   3. Coders are presented with individual items from the test set and they then apply codes to these items. These codes are then referred to as the Round One results.
    -   4. The Round One results are then analyzed. Codes for which agreement has been reached at a level above some criterion level are identified. Other codes are presented to the coders for a second round of evaluation.
    -   5. Coders then examine the codes for which there is no consensus. Written statements justifying their application to the documentation or arguments why they shouldn't be applied to the documentation are generated by the coders.
    -   6. These statements are reviewed by the coders and then, for each nonconsensual code, they decide either to apply the code to the medical documentation or not. The results of this voting are referred to as the Round Two results.
    -   7. The Round Two results are analyzed as in step 4 above: the codes for which agreement has been reached at a level above the criterion level are identified. Other codes for which there is not a sufficient level of agreement are presented to the coders for a subsequent third round of evaluation.
    -   8. Steps five through seven are repeated until the change between rounds falls below an agreed upon level.
    -   9. At the end of the process, codes for which no consensus is reached are arbitrarily chosen to be assigned or not assigned to the documentation based on a random selection criterion. The resulting set of codes is referred to as the consensus view for the associated set of medical documentation.

In the following sections, example steps of such a methodology are described in more detail.

Assembling Medical Documentation for a Test Set

The composition of a test set preferably depends on the evaluation goals of the gold standard and the domain of interest. Medical coding is applied to a variety of medical documents including H&P reports, consultation notes, procedure notes, in-office physician notes, and discharge summaries. The documents assembled for the test set should reflect the types of documents typically encountered by the coders who will construct the standard and who will be evaluated once the standard is constructed.

Two typical ways to characterize medical documents are by the type of medical encounter and by the patient's diagnosis. Documents can also be characterized by the kind of person generating them, the completeness of the documentation, demographic patient information, etc.

Once a set of relevant characteristics for the medical documents has been identified, each combination of characteristics that can occur should be listed (e.g. documents generated during an initial encounter for patients exhibiting symptoms of an acute myocardial infarction, documents generated during a subsequent encounter for patients diagnosed with diabetes, etc.). Each of these combinations constitutes a sampling cell.

Documents are collected that fit the characteristics of each cell. The number of documents collected for each cell depends on the variability of the documents within the cell (greater variability necessitates a larger cell sample). Beyond this, sampling can be done in two ways:

-   -   1. Sampling can be exhaustive. Under this scheme, documents are collected for every single cell. The set then becomes a comprehensive evaluation of documents that fit some defined set of criteria.
    -   2. Sampling can be representative. Under this scheme, documents are collected in proportions reflective of the population from which the sample is derived. For example, if 10 percent of the patients in a particular setting are cardiac patients, 10 percent of the documents in the test set are from cardiac patients.
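The sampling cells and the representative scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sampling_cells` and `representative_quotas`, the example characteristic values, and the largest-remainder rounding rule are hypothetical and are not part of the described method, which does not specify how fractional quotas are rounded.

```python
from itertools import product


def sampling_cells(characteristics):
    """Enumerate every combination of characteristic values; each is one cell.

    `characteristics` maps a characteristic name to its possible values.
    """
    names = list(characteristics)
    return [dict(zip(names, combo))
            for combo in product(*(characteristics[n] for n in names))]


def representative_quotas(population_counts, sample_size):
    """Split `sample_size` documents across cells in proportion to the population.

    Largest-remainder rounding is an assumption made for this sketch.
    """
    total = sum(population_counts.values())
    if total == 0:
        raise ValueError("population is empty")
    # Integer arithmetic avoids floating-point rounding surprises.
    quotas = {c: n * sample_size // total for c, n in population_counts.items()}
    remainders = {c: n * sample_size % total for c, n in population_counts.items()}
    leftover = sample_size - sum(quotas.values())
    # Hand the remaining documents to the cells with the largest remainders.
    for cell in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[cell] += 1
    return quotas


cells = sampling_cells({
    "encounter": ["initial", "subsequent"],
    "diagnosis": ["myocardial infarction", "diabetes"],
})
quotas = representative_quotas({"cardiac": 10, "diabetes": 30, "other": 60}, 20)
```

With these example counts, cardiac patients make up 10 percent of the population and receive 2 of the 20 test-set documents, matching the proportional rule in the text.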

Recruiting Coders

The coders participating in the DMMC should have experience with the types of documents and types of medical coding for which the standard is being created. Beyond establishing a set of criteria which define the minimum qualifications for a participating coder, little else needs to be done. Restricting participation to only the “best” coders in an organization (as, say, identified by a supervisor) may not be advisable since the ability to cooperate and achieve a common understanding of how to code a document is at least as important in this task as knowledge of coding.

Presentation Of Individual Test Items—Round One

An important characteristic of any Delphi method is that the identity of experts remains hidden from other experts (although it is known to the moderator). A convenient way to hide the identity of the experts but to still allow them to be distinguished within the Delphi environment is to assign each coder a “handle” or nickname. Each coder's input during Round One is tagged with the coder's handle. Coders' comments on the input are also tagged.

Coders provide their input using a standardized form (either electronic or paper). Coders should provide this input isolated from others who may be also participating in the DMMC. When coders are finished providing input, it is submitted to the moderator.

Analyzing Round One Results

The main task during this step is to identify those codes for which there is sufficient convergence of opinion so that they can be eliminated from further consideration. To do this, a criterion needs to be established. This can be set as a percentage of agreement among coders (e.g., 90% of the coders agree that code X should be applied to the chart). The remaining codes are resubmitted to the coders for round two.
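As a minimal sketch of this analysis step, the following Python function partitions the codes proposed in a round into converged and unresolved sets. The function name `analyze_round`, the data layout (coder handle mapped to a set of codes), and the example codes are hypothetical. The sketch also assumes that agreement can converge in either direction (apply or do not apply), which the text implies but does not spell out for Round One.

```python
def analyze_round(assignments, criterion=0.9):
    """Partition codes by whether coder agreement reaches the criterion.

    `assignments` maps each coder handle to the set of codes that coder applied.
    Returns (accepted, rejected, unresolved) sets of codes.
    """
    n = len(assignments)
    if n == 0:
        raise ValueError("no coders submitted input")
    candidates = set().union(*assignments.values())
    accepted, rejected, unresolved = set(), set(), set()
    for code in candidates:
        yes = sum(code in codes for codes in assignments.values())
        if yes / n >= criterion:
            accepted.add(code)        # enough coders applied the code
        elif (n - yes) / n >= criterion:
            rejected.add(code)        # enough coders left the code off
        else:
            unresolved.add(code)      # goes back to the coders next round
    return accepted, rejected, unresolved


round_one = {
    "handle_a": {"410.90", "250.00", "401.9"},
    "handle_b": {"410.90", "250.00", "401.9"},
    "handle_c": {"410.90", "250.00"},
    "handle_d": {"410.90", "250.00", "V45.81"},
    "handle_e": {"410.90"},
}
accepted, rejected, unresolved = analyze_round(round_one, criterion=0.8)
```

In this example only the code applied by two of the five coders remains unresolved and would be resubmitted for Round Two.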

Reevaluating Codes for which there is No Consensus—Round Two

Round Two consists of two stages. During the first stage, experts reexamine the codes for which there was no consensus. For each of these codes, they can write statements either in support of or against applying the code to the associated set of documentation. The comments are read and responded to by all the participating experts. After a designated period of time, comments are suspended and the second stage of Round Two occurs, when the experts revote on each code for which there is no consensus.

Analyzing Round Two Results

Analysis at this step is the same as it was during Round One. The same criterion is used to define a converged opinion on whether a code should be applied or not.

Additional Rounds

Progress towards convergence can be measured by using a change score, C, defined as the ratio of the total number of codes for which convergence has been reached during the current round over the total number of codes considered before the round began. For example, if there were 10 codes for which there was no convergence prior to the beginning of the round and four codes for which there was convergence at the end of the round, the change score would be 4/10 or 0.4. Rounds are continued until the change score falls below a certain criterion amount.
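The change score and the stopping rule can be illustrated with a short Python sketch. The function names and the example stopping criterion of 0.2 are hypothetical; the text does not fix a particular criterion amount.

```python
def change_score(unresolved_before, converged_this_round):
    """C = codes that converged this round / codes open before the round began."""
    if unresolved_before <= 0:
        raise ValueError("no unresolved codes were considered in this round")
    return converged_this_round / unresolved_before


def continue_rounds(score, stop_criterion):
    """Rounds continue until the change score falls below the criterion."""
    return score >= stop_criterion


c = change_score(unresolved_before=10, converged_this_round=4)
another_round = continue_rounds(c, stop_criterion=0.2)  # 0.2 is an assumed value
```

Here `c` reproduces the worked example from the text (4/10), and with the assumed criterion another round would be run.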

Assignment of the Remaining Codes

Once the change score has fallen below criterion, the decision of whether or not to assign the remaining codes is made by a random process. This can be done using an unbiased random number generator associated with a rule-governed process (e.g. odd numbers mean assign and even numbers mean don't assign).
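A minimal sketch of the odd/even rule follows, assuming Python's standard pseudo-random generator is an acceptable stand-in for the unbiased random number generator mentioned above. The function name `assign_remaining`, the 1 to 100 draw range, and the optional `seed` parameter (useful for an auditable, repeatable draw) are assumptions of this sketch.

```python
import random


def assign_remaining(codes, seed=None):
    """Decide each unresolved code by a random draw: odd means assign, even means don't.

    Drawing from 1..100 gives odd and even outcomes equal probability.
    """
    rng = random.Random(seed)
    # Sort so that a given seed always yields the same decision per code.
    return {code: rng.randint(1, 100) % 2 == 1 for code in sorted(codes)}


decisions = assign_remaining({"401.9", "V45.81"}, seed=2007)
```

Each value in `decisions` is `True` when the code is added to the consensus view and `False` when it is left off.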

Use of Software Tools to Facilitate DMMC

Although use of a computer system is not necessary to implement DMMC, using at least a software tool like, for example, EMscribe GS has a number of advantages including:

-   -   1. Organizing the presentation of the test set information. A chart or document can be presented electronically to experts. Experts can easily page between different parts of the documentation if there are multiple documents in this scheme.
    -   2. Coding is integrated with the documentation presentation. Experts can assign codes to the documentation in an integrated environment that allows the coder's responses to be easily associated with the relevant documentation.
    -   3. Scoring is automated. A computer system may then be implemented to compute which codes have a sufficient level of convergence and then can package the remaining codes so that they can be presented again in future rounds.
    -   4. Managing expert discussions around codes during individual rounds. A system can be implemented to receive, store and organize statements about codes using discussion threads similar to what is found on internet discussion boards.
    -   5. Managing the voting on codes during rounds. A computer system may then be implemented to automatically tabulate votes and determine through this voting process whether a convergence of opinion has been formed around each code.

Artificial Medical Intelligence has designed EMscribe GS to implement aspects of these features.

DMMC Outputs

DMMC produces two outputs that can be used for evaluation purposes:

Medical Record Test Set

-   -   If done repeatedly, this process will generate a corpus of accurately coded medical records. These medical records can then be used to evaluate coders. To evaluate these new coders, they are simply asked to code the documents for which consensus coding has been obtained. Then the degree of correspondence between these coders and the consensus view can be compared. A score measuring the accuracy of these coders can be obtained employing the commonly used industry metrics of recall and precision. Appendix one includes an explanation of these two metrics.

Coder Evaluation Test

-   -   The process can also be used to evaluate the coders taking part in the DMMC process. Coders are evaluated based on their initial round of coding relative to the eventual consensus view. That is, for each coder participating it is possible to ask what percentage of codes generated during the initial round were also codes that appeared in the consensus view, and how much overcoding each coder did relative to the consensus view. Recall and precision statistics can be generated for these coders in the same manner as above.

For example, in an embodiment of such a method as illustrated in FIG. 1, that may be implemented as a system, such as with software of a computer system, for confirming accuracy of coders (either people and/or automated coding systems such as an automated coding system described in U.S. patent application Ser. No. 11/106,817, filed on Apr. 15, 2005, the entire disclosure of which is incorporated herein by reference), the method may include one or more of the following: establishing a collection of medical documents as a test sample; determining a convergence of assigned medical codes for the test sample that have been assigned by a plurality of coders to establish a standard of one or more accepted medical codes for the test sample; applying the test sample to a coder or coding system to obtain one or more determined medical codes for the test sample; and comparing the determined codes to the accepted medical codes to evaluate or rate the accuracy of the coder or coding system.

Another embodiment of the technology may be implemented with an application of game techniques for medical coding. In addition to the need for a gold standard, there is a need within the medical coding community to provide a quantitative and objective means for evaluating medical coders. Such a measurement instrument could be used by hospitals and other institutions that hire medical coders to evaluate potential candidates for medical coding positions and determine compensation for existing medical coding staff. Artificial Medical Intelligence (AMI) has developed the Coding Game, an online game for the medical coding community that produces both a medical coding Gold Standard and a quantitative evaluation for any medical coder who plays the game.

Applying the Delphi Method to the typical daily task of coding medical documents could be too expensive. However, implementing the Delphi Method as an interactive game in which Medical Coders could compete, potentially win prizes and receive a ranking provides significant incentives to the participants, to the extent that coders are likely to participate for free and may even pay a subscription fee in order to be able to play.

The Coding Game as an Instance of Human Computation

Human based computation is a technique whereby a computational process is performed partially by a computer and partially by one or more humans. In human computation, a problem is presented to multiple individuals to solve. A computer then collects, interprets and integrates the information provided by these individuals to produce a solution (Wikipedia, 2007). The use of gaming techniques to provide incentives for individuals to participate in human computation activities has been used in other contexts such as labeling pictures (von Ahn, 2003) and collecting common-sense facts (von Ahn, 2006). By taking the standard Delphi technique described above and modifying it, Medical Coding can be transformed into a competitive game that will produce both a gold standard for medical records and a quantitative score for each participant in the game, a score that represents that individual's coding ability.

Implementing the Coding Game: Methodology

The coding game is played using a database of de-identified and/or fictitious medical records. A medical record consists of a series of documents that would be generated by medical personnel during any patient encounter. Examples of documents in a medical record include the History and Physical, Operative Report, Consultation Note, Discharge Summary, etc.

A key aspect of the game is the method used to determine if the assignment of a given code to a medical record or document is valid or not. The Coding Game adopts a standard derived from the DMMC cited earlier. The assignment of a medical code to a medical document or medical record will be considered valid if a certain predetermined percentage of participants (called the criterion percentage), coding the same document or record, assign the same code. For example, given an 80% criterion percentage, if 80% or more of the coders evaluating a given medical document determine that a certain code should be assigned to that document, that code would be considered a valid assignment. Codes meeting this criterion will be known as consensus codes.

For the purposes of implementing the Coding Game, both the medical records and their individual component documents, grouped into cases, are classified into three categories:

-   -   1. Medical Records Or Documents For Which Complete Consensus Has Been Obtained. These records are associated with consensus codes and only consensus codes. Records falling into this category will be known as Complete Consensus Records (CCRs).
    -   2. Medical Records Or Documents For Which Incomplete Consensus Has Been Obtained. These records are associated with medical codes, at least some of which are not consensus codes. Records falling into this category will be known as Incomplete Consensus Records (ICRs).
    -   3. Medical Records Or Documents That Are Not Yet Associated With Medical Codes. These are medical records that have not yet been coded. Records falling into this category will be known as Uncoded Records (URs).
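The three categories above amount to a simple rule on the codes currently associated with a record. A minimal Python sketch, assuming a hypothetical data layout in which each associated code is mapped to a flag saying whether it has reached consensus:

```python
def classify_record(code_status):
    """Classify a record as "CCR", "ICR" or "UR".

    `code_status` maps each code associated with the record to True when the
    code is a consensus code and False otherwise (an assumed representation).
    """
    if not code_status:
        return "UR"    # no codes associated yet: Uncoded Record
    if all(code_status.values()):
        return "CCR"   # consensus codes and only consensus codes
    return "ICR"       # at least one code has not reached consensus


labels = [
    classify_record({}),
    classify_record({"410.90": True, "250.00": True}),
    classify_record({"410.90": True, "401.9": False}),
]
```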

A game consisting of a series of cases is played by a group of coders, all of whom are asked to assign medical codes to the same collection of cases. Each case may consist of a series of component documents comprising a medical record or may be independent medical documents. Any established medical coding system can be used when playing this game (e.g. ICD-9, ICD-10, CPT, SNOMED, etc.). During each case, participants are exposed to a complete medical record or a single medical document (e.g. a History and Physical). Participants then respond in one of two ways:

-   -   1. For CCRs or URs, they assign one or more medical codes to the document or medical record. Assignments may be done through a web-based user interface as shown in FIG. 2, which shows an embodiment of a user interface display when assigning codes. Similarly, FIG. 4 shows another embodiment of a user interface display when assigning codes during a round one of the game embodiment.
    -   2. For ICRs, they vote on whether one or more medical codes should be assigned to a medical record or document. When voting, participants respond “yes” or “no” for each of the nonconsensus codes presented. Voting occurs for each code assigned to a medical record or document. Voting may also occur through a web-based user interface as shown in FIG. 3, which shows an embodiment of a generic user interface display when voting. Similarly, FIG. 5 shows another embodiment of a user interface display when voting on codes during a round two of a game embodiment.

Presenting participants with a mix of cases, some associated with consensus codes, some with incomplete or nonconsensus codes, serves two needs of the game:

-   -   1. Keeping participants engaged by presenting them with immediate feedback on their submissions.
    -   2. Expanding the body of medical records associated with consensus codes, CCRs. The ratio of cases using CCRs to non-CCRs (URs and ICRs) is not fixed but is determined by the number of participants in the game. However, the number of CCR cases should be greater than the number of non-CCR cases to maintain participant interest.

Scoring

Participants receive a score based on their correct responses. A score is provided immediately during cases in which participants are coding records classified as CCRs. A score is provided after some delay when participants are coding or voting on records classified as either URs or ICRs. The length of the delay in scoring a UR or ICR case is dependent on the amount of time it takes a criterion percentage of coders to agree on a set of codes to be associated with the document or medical record used in the case and is wholly determined by the number of participants and the frequency with which they play the game. If the criterion percentage of coders is 80%, then when 80% of the coders have agreed with the codes assigned to a medical record or document used in a UR or ICR case, all participants who have coded these documents will receive positive or negative feedback depending on whether they assigned the consensus code. Under conditions in which many individuals are playing the game at the same time, feedback could be almost immediate. With lighter traffic the delay will be greater, and some results might first become available the next time the player logs onto the site, increasing the desire to repeatedly participate in the game.

Scoring will be based on two standard industry metrics, precision and recall. Both measures rely on the existence of something that represents “truth”, in this case, the consensus codes associated with the medical record or document. A coder can either:

-   -   Assign a code when the code is also a consensus code (this is called a true positive or tp).
    -   Not assign a code when the code is not a consensus code (this is called a true negative or tn).
    -   Not assign a code when the code is a consensus code (this is called a false negative or fn).
    -   Assign a code when the code is not a consensus code (this is called a false positive or fp).

These four possibilities are summarized in the following table A-1 (after Manning & Schutze, 1999).

Table A-1

                                 Consensual Code
Coder (Human)         Code Assigned    Code Not Assigned
Code Assigned         tp               fp
Code Not Assigned     fn               tn

Precision is then defined as:

${precision} = \frac{tp}{{tp} + {fp}}$

and recall is defined as:

${recall} = \frac{tp}{{tp} + {fn}}$

The precision of a participant is simply the percentage of the items coded that were also consensus codes. The recall of the participant is the percentage of consensus codes that the participant also coded.

The two measures can be combined into a measure called the F Measure (Manning & Schutze, 1999) which is defined as

$F = \frac{1}{{\alpha \left( {1/P} \right)} + {\left( {1 - \alpha} \right){1/R}}}$

where P is precision, R is recall and α is a weight that determines the relative importance of precision and recall (typically this is set to 0.5). F statistics normally range from 0 to 1. To present a score that is easier to read, the F statistic will be multiplied by 100 when presented to a user.
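As a minimal sketch of the scoring just described, the following computes precision, recall, and the weighted F measure for one participant on one record. The function names are hypothetical, and returning 0 when a denominator would be zero is an assumption of this sketch, since the text does not address that case.

```python
def precision_recall(assigned, consensus):
    """Compute (precision, recall) of assigned codes against consensus codes."""
    assigned, consensus = set(assigned), set(consensus)
    tp = len(assigned & consensus)   # assigned and in the consensus
    fp = len(assigned - consensus)   # assigned but not in the consensus
    fn = len(consensus - assigned)   # in the consensus but not assigned
    # Zero denominators are mapped to 0.0 (an assumption, not from the text).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_measure(precision, recall, alpha=0.5):
    """F = 1 / (alpha * (1/P) + (1 - alpha) * (1/R))."""
    if precision == 0 or recall == 0:
        return 0.0
    return 1.0 / (alpha / precision + (1 - alpha) / recall)


def displayed_score(assigned, consensus, alpha=0.5):
    """F statistic multiplied by 100 for presentation to a user."""
    p, r = precision_recall(assigned, consensus)
    return 100 * f_measure(p, r, alpha)
```

For example, a participant who assigns three codes, two of which are among three consensus codes, has tp = 2, fp = 1, and fn = 1, so precision and recall are both 2/3 and the displayed score is approximately 66.7.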

Game Rounds

Individual games are structured into rounds.

-   Level 1: consists of participants who are new to the game or who have not had one of the top P scores in previous level 1 rounds (where P is a percentage assigned by the implementers of the game).
-   Level 2: consists of participants who obtained one of the top P scores in the most recent level 1 game played.
-   Level 3: consists of participants who had one of the top P scores in the most recent level 2 game played.

Additional rounds will be patterned like this, with the Nth level round consisting of participants who had one of the top P scores in the most recent level N−1 round played. This embodiment of the Coding Game does not have a fixed number of rounds. The number of rounds is wholly determined by the number of participants registered for the game.
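One possible way to select the participants who advance from a level N−1 round to a level N round is sketched below. The function name `promote`, the rounding of the cutoff upward, and the arbitrary handling of tied scores are all assumptions of this sketch; the text specifies only that the top P scores advance.

```python
import math


def promote(round_scores, top_fraction):
    """Return the participants whose scores fall in the top fraction of a round.

    `round_scores` maps a participant identifier to that participant's
    score in the round; `top_fraction` is P expressed as a fraction (0 to 1).
    """
    if not round_scores:
        return set()
    # Round the cutoff up so that a non-empty round promotes at least one
    # participant whenever top_fraction > 0 (an assumption of this sketch).
    n_promoted = math.ceil(len(round_scores) * top_fraction)
    ranked = sorted(round_scores, key=round_scores.get, reverse=True)
    return set(ranked[:n_promoted])


level_1_scores = {"ann": 90.0, "bob": 80.0, "cat": 70.0, "dan": 60.0}
level_2_players = promote(level_1_scores, 0.5)
```

Applying `promote` again to the scores from the level 2 round would yield the level 3 participants, and so on for as many rounds as the number of participants supports.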

Rankings

Participants playing the Coding Game will receive an overall score that reflects their performance within individual rounds and the round level that they have achieved. This overall score is determined by the following formula:

${Score} = 10{\sum\limits_{x = 1}^{n}\frac{\mu_{x}}{n}}$

where μ_x is the average score a participant receives in a given round, n is the number of rounds, and x ranges from 1 to the number of rounds. This overall score will range from 0 to 1000. It reflects both the performance within each round as well as the level of round achieved. In a game with ten levels, individuals who have reached level 4 can have an overall score as high as 400. Those who have reached level 7 can have an overall score as high as 700, etc. Rankings will be determined by ordering scores highest to lowest.
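The overall score and ranking can be sketched as follows, reading the formula as ten times the sum of the per-round averages divided by the number of rounds in the game, and treating rounds a participant has not reached as contributing nothing. That reading is an assumption chosen because it reproduces the 400 and 700 point ceilings given above; the function names are hypothetical.

```python
def overall_score(round_averages, total_rounds):
    """Return 10 * sum(mu_x / n) over the rounds a participant has played.

    `round_averages` holds the participant's average score (0 to 100) for
    each round played so far; `total_rounds` is n, the number of rounds
    in the game.
    """
    if total_rounds <= 0:
        raise ValueError("total_rounds must be positive")
    if len(round_averages) > total_rounds:
        raise ValueError("more round averages than rounds in the game")
    return 10 * sum(round_averages) / total_rounds


def rank(scores):
    """Order participants by overall score, highest first."""
    return sorted(scores, key=scores.get, reverse=True)


# A participant with perfect averages through level 4 of a ten-level game.
level_4_ceiling = overall_score([100, 100, 100, 100], total_rounds=10)
```

Under this reading, a participant with perfect averages in all ten rounds reaches the maximum of 1000, and one with an average of 50 in each of two rounds scores 100.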

Coding Game Products

The Coding Game yields several products that are of general use to the medical community. Specifically:

1.  The Coding Game produces a corpus of medical records with associated medical codes that can serve as a gold standard for medical coding. The codes are the product of a process that drives participants towards consensus. Unlike other methods that might produce a gold standard, the Coding Game produces empirical data that justifies the assignment of each code. The consensus percentage represents a precise measure of the degree of agreement within the coding community that any particular code should be associated with any particular document.
    a.  The gold standard itself has numerous uses including the screening of potential job applicants at medical institutions that do coding,
    b.  the training of individuals for coding positions,
    c.  the evaluation of various computer assisted coding software systems.
2.  The Coding Game provides a precise measure of a human coder's ability. The scoring system that is part of the Coding Game can be used as a measure of this ability. The scale, which ranges from 0 to 1000, is a much more precise metric than the current system that relies on inexact correlates to coding ability such as years of experience and passing a certification program.
    a.  This metric is useful both to hiring institutions such as hospitals as well as to the coding community itself.
    b.  The metric will provide greater transparency in the marketplace for medical coders, allowing the best coders to command a premium for their services while institutions that hire medical coders can make intelligent tradeoffs between compensation and ability.
3.  The Coding Game is a vehicle for training coders. Because it has been constructed to be enjoyable, coders will have an incentive to participate in the game. Through their participation, they will gain valuable experience and feedback on their coding ability. Like many pedagogic systems that provide experience and feedback, training is a natural consequence of this process.

Using the Delphi Method in a Game Setting for Other Content Domains

The methods and scoring algorithms described in this patent application are not restricted to Medical Coding. They can be applied to any domain that has the following characteristics:

1.  There is significant variance (disagreement) among identified experts in the domain when asked to solve domain specific problems. Medical coders have a well documented history of variability of response when asked to code the same documents. However, the field of medical coding is not unique in this regard. A few examples from other domains include: medical diagnoses, legal precedent analysis and opinion, tax and accounting practices and property appraisal.
2.  The potential responses that experts can provide are limited and objectively describable a priori. Like medical codes, the expert responses of an eligible domain for this kind of treatment must be easily described. For example, a physician rendering an opinion about a medical diagnosis is easily describable because medical diagnoses are limited and are well documented. An opinion rendered by an accountant about the legitimacy of writing off a particular deduction also falls into this category.

The above described embodiments may be programmed as software for computerized systems, such as PDAs, portable computers, desktop computers, networked computers or devices that can interact with servers and/or other networked devices over a network such as the Internet. For example, in one embodiment, a method for coding medical documents using a digital processor may include presenting a set of documents to each of a plurality of game contestants in a user interface; receiving input from at least one of the user interfaces concerning a potential classification code for the set of documents; and determining a score for at least one of the plurality of game contestants based on the received input from the user interface. This method as previously described may further include determining an accepted classification code for the set of documents from input of the user interfaces of the plurality of game contestants. Moreover, this method as previously described may further include determining the score based on the determination of the accepted classification code.

Although the technology herein has been described with reference to particular embodiments, it is to be understood that these embodiments are merely illustrative of the principles and applications of the technology. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the technology.

APPENDIX 1 An Explanation of Sensitivity and Recall Measures

A computerized system as discussed previously may be implemented to measure or determine the accuracy of Coders and Computer Assisted Coding (CAC) with two standard industry metrics, precision and recall. Both measures rely on the existence of an entity which represents truth, in this case, the consensual coding (gold) standard that is the product of the DMMC process. A coder, viewing the same set of medical records that are part of the gold standard, can either:

-   Assign a code when the code is also part of the consensual view (this is called a true positive or tp).
-   Not assign a code when the code is not part of the consensual view (this is called a true negative or tn).
-   Not assign a code when the code is part of the consensual view (this is called a false negative or fn).
-   Assign a code when the code is not part of the consensual view (this is called a false positive or fp).

These four possibilities are summarized in Table A-1 (after Manning & Schutze, 1999).

                                Consensual View (Gold Standard)
    Coder (Machine or Human)    Code Assigned    Code Not Assigned
    Code Assigned               tp               fp
    Code Not Assigned           fn               tn

Precision is then defined as:

${precision} = \frac{tp}{{tp} + {fp}}$

and recall is defined as:

${recall} = \frac{tp}{{tp} + {fn}}$

The precision of a coding system is simply the percentage of the items coded that were also part of the gold standard, the consensual view. The recall of the system is the percentage of gold standard items that the coding system also coded.

The two measures can be combined into a measure called the F Measure (Manning & Schutze, 1999) which is defined as:

$F = \frac{1}{{\alpha \frac{1}{p}} + {\left( {1 - \alpha} \right)\frac{1}{R}}}$

where P is precision, R is recall and α is a weight that determines the relative importance of precision and recall (typically this is set to 0.5).
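To illustrate how these measures might rate a coding system against the gold standard across a whole test sample, the sketch below pools the tp, fp, and fn counts over all records before computing precision, recall, and F. Pooling the counts (rather than averaging per-record scores) is an assumption of this sketch, as are the function name `evaluate_system`, the record identifiers, and the example codes.

```python
def evaluate_system(system_codes, gold_codes, alpha=0.5):
    """Rate a coder or CAC system against gold standard codes.

    Both arguments map a record identifier to the codes assigned to that
    record. Records missing from `system_codes` count as having no codes.
    """
    tp = fp = fn = 0
    for record_id, gold in gold_codes.items():
        assigned = set(system_codes.get(record_id, ()))
        gold = set(gold)
        tp += len(assigned & gold)   # in both the system output and the gold standard
        fp += len(assigned - gold)   # assigned by the system only
        fn += len(gold - assigned)   # in the gold standard only
    # Zero denominators are mapped to 0.0 (an assumption, not from the text).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0 or recall == 0:
        f = 0.0
    else:
        f = 1.0 / (alpha / precision + (1 - alpha) / recall)
    return {"precision": precision, "recall": recall, "f": f}


# Illustrative two-record sample.
gold = {"rec1": {"250.00", "401.9"}, "rec2": {"486"}}
system = {"rec1": {"250.00"}, "rec2": {"486", "401.9"}}
rating = evaluate_system(system, gold)
```

In the illustrative sample the pooled counts are tp = 2, fp = 1, and fn = 1, so precision, recall, and F are each 2/3.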

1. A method for coding medical documents comprising: presenting a set of documents to each of a plurality of game contestants in a user interface; receiving input from at least one of the user interfaces concerning a potential classification code for the set of documents; and determining a score for at least one of the plurality of game contestants based on the received input from the user interface.

2. The method of claim 1, further comprising determining an accepted classification code for the set of documents from input of the user interfaces of the plurality of game contestants.

3. The method of claim 2, further comprising determining the score based on the determination of the accepted classification code.

4. An automated system for coding medical documents comprising: user interfaces to present a set of electronic medical documents to each of a plurality of game contestants; and processor control instructions programmed to receive input from at least one of the user interfaces concerning a potential classification code for the set of electronic medical documents; and processor control instructions to determine a score for at least one of the plurality of game contestants based on the received input from the user interface.

5. The automated system of claim 4, further comprising processor control instructions to determine an accepted classification code for the set of electronic medical documents from input of the user interfaces of the plurality of game contestants.

6. The automated system of claim 5, further comprising processor control instructions to determine the score based on the determination of the accepted classification code.